Combining Hierarchical Clustering and Machine Learning to Predict High-Level Discourse Structure
Abstract
We propose a novel method to predict the interparagraph discourse structure of text, i.e. to infer which paragraphs are related to each other and form larger segments on a higher level. Our method combines a clustering algorithm with a model of segment “relatedness” acquired in a machine learning step. The model integrates information from a variety of sources, such as word co-occurrence, lexical chains, cue phrases, punctuation, and tense. Our method outperforms an approach that relies on word co-occurrence alone.
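The abstract does not specify the clustering algorithm or the learned model, so the following is only a minimal sketch of the general idea, under stated assumptions: adjacent segments are merged bottom-up, always joining the pair with the highest relatedness score. The names `cluster_paragraphs` and `word_overlap` are hypothetical, and `word_overlap` is a simple stand-in (closer to the word co-occurrence baseline) for the learned relatedness model, which would also draw on lexical chains, cue phrases, punctuation, and tense.

```python
from typing import Callable, List, Tuple, Union

# A discourse tree is either a paragraph index or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def word_overlap(a: List[str], b: List[str]) -> float:
    """Placeholder relatedness score (an assumption, not the paper's model):
    Jaccard overlap between the word sets of two segments."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cluster_paragraphs(
    paragraphs: List[str],
    relatedness: Callable[[List[str], List[str]], float] = word_overlap,
) -> Tree:
    """Greedily merge the most related pair of adjacent segments
    until a single tree spans the whole text."""
    if not paragraphs:
        raise ValueError("at least one paragraph is required")

    # Each segment holds (subtree, tokens); initially one per paragraph.
    segments = [(i, p.lower().split()) for i, p in enumerate(paragraphs)]

    while len(segments) > 1:
        # Only adjacent segments may merge, which keeps the text order intact.
        best = max(
            range(len(segments) - 1),
            key=lambda i: relatedness(segments[i][1], segments[i + 1][1]),
        )
        left, right = segments[best], segments[best + 1]
        merged = ((left[0], right[0]), left[1] + right[1])
        segments[best:best + 2] = [merged]

    return segments[0][0]


example = [
    "cats purr softly",
    "cats purr loudly",
    "stocks fell sharply",
    "stocks fell again",
]
print(cluster_paragraphs(example))  # ((0, 1), (2, 3))
```

In this sketch a trained classifier or regressor could be passed as `relatedness` in place of `word_overlap`; the merge loop itself would not need to change.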
Similar resources
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners
The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...
COMMIT at SemEval-2016 Task 5: Sentiment Analysis with Rhetorical Structure Theory
This paper reports our submission to the Aspect-Based Sentiment Analysis task of SemEval 2016. It covers the prediction of sentiment for a given set of aspects (e.g., subtask 1, slot 2) for the English language using discourse analysis. To that end, a discourse parser implementing the Rhetorical Structure Theory is employed and the resulting information is used to determine the context of each ...
Hierarchical Conversation Structure Prediction in Multi-Party Chat
Conversational practices do not occur at a single unit of analysis. To understand the interplay between social positioning, information sharing, and rhetorical strategy in language, various granularities are necessary. In this work we present a machine learning model for multi-party chat which predicts conversation structure across differing units of analysis. First, we mark sentence-level beha...
Stock Price Prediction using Machine Learning and Swarm Intelligence
Background and Objectives: Stock price prediction has become one of the interesting and also challenging topics for researchers in the past few years. Due to the non-linear nature of the time-series data of the stock prices, mathematical modeling approaches usually fail to yield acceptable results. Therefore, machine learning methods can be a promising solution to this problem. Methods: In this...
Publication date: 2004